Contact tracing is a tried-and-true public health method for exploring how infectious diseases spread. The concept is simple – trace the contacts of anyone who tests positive for an infectious disease so that you can monitor for symptoms in those contacts and mitigate the spread of disease. This application is designed to facilitate contact tracing on visit-based data with a specific focus on providers of community-based healthcare. In a community-based healthcare system, patients typically are homebound and are visited at home by care providers. This means that direct contact occurs between patients and visit staff in a community-based healthcare system, but there is no direct patient-to-patient contact. On the other hand, hospitals and clinics are examples of facility-based healthcare settings, where direct patient-to-patient, patient-to-staff, and staff-to-staff contact can happen.
The VisitContactTrace application allows users to load their own visit data in order to:
* explore how infectious disease can spread within a visit-based service delivery model if appropriate precautions are not in place; and
* conduct visit-based contact tracing of the primary, secondary, and tertiary contacts of any patient or visit staff member whose disease status is available to you.
Important disclaimer: this application does not suggest causality or confirm disease transmission routes. Rather, it provides a means to explore how infectious disease may spread exponentially among patients and visit staff if precautions are not put into place in a visit-based service delivery model such as a community-based healthcare setting.
The VisitContactTrace application was created during the COVID-19 pandemic by the Data Science team at the Visiting Nurse Service of New York in order to support the organization’s contact tracing efforts. This application can be used for visit-based contact tracing of any infectious disease. Our hope is that VisitContactTrace is useful to agencies providing community-based healthcare and to other organizations that have visit-based service delivery models.
Learn more about VNSNY’s COVID-19 response here.
Requirements for the VisitContactTrace Application
Installing the VisitContactTrace R package
Running VisitContactTrace on Your PC
Input Data Type and Structure
Data Specifications
Using the VisitContactTrace Application
Try VisitContactTrace with Demonstration Feature
Importing Data
Exit/Reload data
Querying VisitContactTrace
The Output/Results
Output - Contact Lists
Output - Plot
Output - Visit Details
Other Useful R Functions/Objects (for experienced R users)
Help Getting Started with R
License
Acknowledgments
VisitContactTrace is an R package, so it requires R, an open-source software environment, to be installed. For more information about R, visit the R Project for Statistical Computing. If you do not have R installed on your computer or do not have much experience with R, go to Help Getting Started with R before proceeding to the next section.
The VisitContactTrace application allows users to upload data manually. For example, a user may have access to a data extract from a standard report of service encounters from their organization’s electronic medical record system. The user can save this data file as an .xlsx or .csv file and upload it to the VisitContactTrace application. More sophisticated R users can adapt the application’s source code to read in datasets created from an ETL tool or incorporate the code into a data workflow. See Data Specifications for more details.
The following code must be run in R the first time you use VisitContactTrace (and any time you switch versions of R). This step may take a while to run, as there are a lot of other packages that need to be downloaded and installed before VisitContactTrace can run successfully. Copy and paste the following lines of code (preserving the upper- and lowercase letters) into the R Console and press “Enter” on the keyboard to install the development version of VisitContactTrace from GitHub:
depend.pack <- c('anytime', 'shiny', 'shinydashboard', 'randomcoloR', 'shinyFiles', 'shinycssloaders', 'shinyWidgets', 'data.table', 'assertthat', 'dplyr', 'purrr', 'rmarkdown', 'visNetwork', 'DT', 'fst', 'stringr', 'shinyalert', 'epicontacts', 'fs', 'readxl', 'shinyjs')
install.packages(depend.pack, dependencies=TRUE, repos="http://lib.stat.cmu.edu/R/CRAN/")
# VNSNY Internal Employees Only (Remove before making public)
install.packages("http://stats.vnsny.org/VisitContactTrace/VisitContactTrace_0.1.0.tar.gz", repos = NULL, type = "source")
# Public version install
# install.packages("remotes")
# remotes::install_github("vnsny-bia/VisitContactTrace")
You will know that the packages are installed and that R is ready for the next command when the prompt reappears in the R console:
>
Once the prompt appears, you can proceed to running VisitContactTrace.
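If you want to confirm that the dependencies were installed before moving on, a quick check along the following lines may help. This is a minimal sketch rather than part of the package: `missing_packages` is a hypothetical helper name, and the package list is shortened for illustration.

```r
# Hypothetical helper (not part of VisitContactTrace): return the packages
# in `pkgs` that are not yet installed in the current R library paths.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

# Shortened dependency list for illustration; use the full list from the
# installation step in practice.
depend.pack <- c("shiny", "data.table", "dplyr")
to_install <- missing_packages(depend.pack)

if (length(to_install) == 0) {
  message("All listed packages are already installed.")
} else {
  message("Still missing: ", paste(to_install, collapse = ", "))
}
```

If any packages are reported as missing, re-running the installation step for just those packages is usually enough.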
Type the following commands (preserving the upper- and lowercase letters) into the R Console and press “Enter” to start the application:
library(VisitContactTrace)
VisitContactTrace()
Run those two commands from an R session every time you want to use VisitContactTrace.
The VisitContactTrace application supports a common data structure used in community-based healthcare settings for functions such as billing and clinical record documentation. This data structure, known as “encounter data” or “visit data,” was the motivation for creating this application. In a community-based healthcare setting, patients are usually homebound or have significant disability, and are not observed to encounter each other. The VisitContactTrace application uses only these visit interactions or “encounters” between visit staff and patients to trace the possible transmission route of an infectious disease in a visit-based service delivery model. While it is possible for community-based visit staff to interact with each other in the field under certain circumstances, it is an uncommon occurrence, and VisitContactTrace currently does not support contact tracing for those interactions. The concept of visit-based contact tracing can be used in other visit-based service delivery models outside of community-based healthcare settings.
The image below shows a snippet of an example dataset where a handful of clinicians have delivered visits to a few patients during an observation window of February - May 2020. In this simulated sample dataset, Patient 4 was first visited by Anna Caroline Maxwell on February 27, 2020, followed by several visits by Lillian Wald every 2-6 days from February 29, 2020 to March 31, 2020.
The VisitContactTrace application will not produce accurate results if there are any data integrity or completeness issues. Please take the following into consideration when preparing a data file to upload into the application:
There may be other precautions necessary that the authors of VisitContactTrace have not anticipated. Please be thoughtful about other considerations relevant to your organization when uploading a dataset into the application.
The VisitContactTrace application recognizes the following data fields:
| Column Name | Format | Required | Description |
|---|---|---|---|
| PATIENT_ID | Character | FALSE | Unique identifier of patient. If this column is absent, PATIENT_NAME is used instead. |
| PATIENT_NAME | Character | TRUE | First and last name of patient.* If the PATIENT_ID column is absent, this column is used as the unique identifier for patients. |
| VISIT_DATE | DATE | TRUE | The date that a visit staff member visits a patient. Acceptable date formats include ‘2004-03-21 12:45:33.123456’, ‘2004/03/21 12:45:33.123456’, ‘20040321 124533.123456’, ‘03/21/2004 12:45:33.123456’, ‘03-21-2004 12:45:33.123456’, ‘2004-03-21’, ‘20040321’, ‘03/21/2004’, ‘03-21-2004’, ‘20010101’ |
| STAFF_ID | Character | FALSE | Unique ID for visit staff member. If this column is absent, STAFF_NAME is used instead. |
| STAFF_NAME | Character | TRUE | First and last name of visit staff member*. If the STAFF_ID column is absent, this column is used as the unique identifier for visit staff members. |
| PATIENT_STATUS | Character | FALSE | Labels used to indicate a status for each patient, such as confirmation of an infectious disease or some other status (e.g. “POSITIVE”, “NEGATIVE”, “SUSPECTED”). This label is case-sensitive (meaning that “Positive”, “positive”, and “POSITIVE” are all considered different statuses) and must be applied to all applicable visit observations for the patient. See the Output - Plot section to learn how the application uses this column. |
| STAFF_STATUS | Character | FALSE | Labels used to indicate a status for each staff member, such as confirmation of an infectious disease or some other status (e.g. “POSITIVE”, “NEGATIVE”, “SUSPECTED”). This label is case-sensitive (meaning that “Positive”, “positive”, and “POSITIVE” are all considered different statuses) and must be applied to all applicable visit observations for the staff member. See the Output - Plot section to learn how the application uses this column. |
* Many users may work with data systems that store patient/staff name in two columns (first name & last name). Those users should consider concatenating those columns prior to uploading the data into the application.
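As a minimal sketch of the concatenation step described in the note above, the following base R code combines two name columns into one. The column names `FIRST_NAME` and `LAST_NAME` are assumptions for illustration; substitute whatever your data system uses.

```r
# Illustrative extract with staff names split across two columns.
# FIRST_NAME and LAST_NAME are hypothetical column names.
visits_raw <- data.frame(
  FIRST_NAME = c("Lillian", "Anna Caroline"),
  LAST_NAME  = c("Wald", "Maxwell"),
  stringsAsFactors = FALSE
)

# Combine first and last name into the single STAFF_NAME column that the
# application expects (separated by one space).
visits_raw$STAFF_NAME <- paste(visits_raw$FIRST_NAME, visits_raw$LAST_NAME)

print(visits_raw$STAFF_NAME)
```

The same approach applies to patient names when building the PATIENT_NAME column.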
The columns in the dataset can be in any order. However, PATIENT_NAME, STAFF_NAME, and VISIT_DATE are required columns and must be spelled exactly as specified. The VisitContactTrace application will ignore any column names that do not exactly match those documented here. It is highly recommended that PATIENT_ID and STAFF_ID are derived from a data source that treats these fields as a unique key - i.e., that these columns uniquely identify specific patients and staff members. When PATIENT_ID and STAFF_ID are provided, the application relies on the integrity of these fields in order to produce accurate contact tracing. If either of these columns is not available, the application will use the PATIENT_NAME or STAFF_NAME column to uniquely identify a patient or staff member, respectively. Thus, in the absence of the PATIENT_ID and STAFF_ID columns, users should be careful to:
* address inconsistencies in spelling, use of upper- and lowercase letters, use of extraneous spaces, and the order of first and last names for the names contained in PATIENT_NAME and STAFF_NAME. For example, “Lillian Wald”, “lillian wald”, “Wald, Lillian”, and “Lillian  Wald” (with 2 spaces between first and last name instead of one) would all be treated as different individuals. Similarly, “Hazel Johnson-Brown” and “Hazel Johnson Brown” (not hyphenated) would be treated as different individuals as well.
* ensure that patients or staff with common names are represented differently in the dataset. For example, if two different patients are named “John Doe”, then the patient names should be distinct in some way (e.g. “John Doe DOB 2/3/1950” and “John Doe DOB 4/26/1933”).
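The name inconsistencies described above can often be reduced before upload. The sketch below is one possible approach in base R, not something the application does for you: `normalize_name` is a hypothetical helper that trims and collapses spaces, reorders “Last, First” to “First Last”, and converts to upper case. It does not resolve hyphenation differences or distinguish people who share a name, so those cases still need manual review.

```r
# Hypothetical helper: standardize the spelling conventions of a name vector.
normalize_name <- function(x) {
  x <- trimws(x)                                  # drop leading/trailing spaces
  x <- gsub("\\s+", " ", x)                       # collapse repeated spaces
  x <- sub("^([^,]+),\\s*(.+)$", "\\2 \\1", x)    # "Last, First" -> "First Last"
  toupper(x)                                      # remove case differences
}

staff <- c("Lillian Wald", "lillian wald", "Wald, Lillian", "Lillian  Wald")
cleaned <- normalize_name(staff)

print(unique(cleaned))
```

After cleaning, all four spellings collapse to a single value, so they would be treated as one individual.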
The user interface for uploading data will raise an error if the user attempts to submit a data file without the required columns. It also lets users rename columns to the correct spelling.
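You can also check for the required columns yourself before opening the application. The following is a minimal sketch under the assumption that your extract is already loaded as a data frame; `missing_required` is a hypothetical helper and does not reflect how the application performs its own check.

```r
# Required column names, as listed in the data specifications.
required_cols <- c("PATIENT_NAME", "STAFF_NAME", "VISIT_DATE")

# Hypothetical helper: list required columns that are absent from `df`.
missing_required <- function(df) {
  setdiff(required_cols, names(df))
}

# Illustrative extract with one misspelled column name.
visits <- data.frame(
  Patient_Name = "Patient 4",
  STAFF_NAME   = "Lillian Wald",
  VISIT_DATE   = "2020-02-29",
  stringsAsFactors = FALSE
)

before <- missing_required(visits)   # "PATIENT_NAME"

# Rename the column to the exact spelling the application recognizes.
names(visits)[names(visits) == "Patient_Name"] <- "PATIENT_NAME"
after <- missing_required(visits)    # character(0)
```

Because column matching is exact, a check like this catches differences in case or spelling that are easy to miss by eye.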
The following figure is the welcome screen that appears as soon as the application opens.
If you don’t have data available but would like to experiment with the application, you can try out the simulated dataset that comes with the VisitContactTrace application. Click on the “Try Out Demo Data” button in the bottom right-hand corner of the welcome screen.
If you do have data you would like to use, click on “Choose Data File” and browse to the dataset that you wish to import into the VisitContactTrace application.
The “View Selected File” button provides a preview of the data import and the ability to rename columns to the names defined in data specifications. If column names and formats are correct, the “Use Selected File” button will import the data into the application. If not, the user will be notified of an error.
When using the VisitContactTrace application, the user needs to identify an individual that serves as the “index” person in a contact tracing investigation. This application was designed assuming that the user has a list of individuals (patients and/or staff members) to conduct contact tracing on; the application itself does not inform the user as to which individuals need contact tracing.
Querying Parameter Instructions:
Click on the “Run” button.
Please note that, depending on your PC’s hardware, the size of your dataset, and the length of the time window for the contact trace, you may experience long computation times.
The algorithm behind the VisitContactTrace application first identifies the primary visit-based contacts of the index person during the specified window of time. It then identifies the secondary and tertiary visit-based contacts, that is, those two and three degrees of separation away from the index person. Secondary contact visits must have occurred after the corresponding primary contact visit dates, and tertiary contact visits must have occurred after the secondary contact visits.
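The tracing logic described above can be illustrated with a small base R sketch. This is not the package’s implementation: `next_contacts` is a hypothetical helper, the data are made up, the index person is assumed to be a staff member, and “after” is interpreted as strictly later than the earlier contact date (the package may treat same-day visits differently).

```r
# Made-up visit data: S* are staff, P* are patients.
visits <- data.frame(
  STAFF_NAME   = c("S1", "S1", "S2", "S2", "S3"),
  PATIENT_NAME = c("P1", "P2", "P2", "P3", "P3"),
  VISIT_DATE   = as.Date(c("2020-05-06", "2020-05-07", "2020-05-09",
                           "2020-05-10", "2020-05-12")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: from a set of people (`frontier`, with the date each
# was first reached in `since`), find the people they met afterwards.
next_contacts <- function(visits, frontier, from_col, to_col, exclude) {
  joined <- merge(visits, frontier, by.x = from_col, by.y = "id")
  keep <- joined$VISIT_DATE > joined$since & !(joined[[to_col]] %in% exclude)
  joined <- joined[keep, , drop = FALSE]
  if (nrow(joined) == 0) {
    return(data.frame(id = character(0), since = as.Date(character(0))))
  }
  # Earliest qualifying visit for each newly reached person.
  first_visit <- tapply(as.numeric(joined$VISIT_DATE), joined[[to_col]], min)
  data.frame(id = names(first_visit),
             since = as.Date(as.vector(first_visit), origin = "1970-01-01"),
             stringsAsFactors = FALSE)
}

window_start <- as.Date("2020-05-05")
window_end   <- as.Date("2020-05-19")
in_window <- visits[visits$VISIT_DATE >= window_start &
                      visits$VISIT_DATE <= window_end, ]

# Index staff member S1; `since` is set so that all in-window visits count.
index <- data.frame(id = "S1", since = window_start - 1,
                    stringsAsFactors = FALSE)

primary   <- next_contacts(in_window, index, "STAFF_NAME", "PATIENT_NAME",
                           exclude = character(0))
secondary <- next_contacts(in_window, primary, "PATIENT_NAME", "STAFF_NAME",
                           exclude = index$id)
tertiary  <- next_contacts(in_window, secondary, "STAFF_NAME", "PATIENT_NAME",
                           exclude = primary$id)
```

In this toy example, S1’s primary contacts are P1 and P2, the secondary contact is S2 (who visited P2 after S1 did), and the tertiary contact is P3. S3 is a fourth-degree contact and is therefore not traced.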
In the screenshot above, Florence Nightingale has been selected as the index staff person for visit-based contact tracing of a novel infectious disease. In this hypothetical example, her symptom onset date was May 12, 2020 and is used as the reference date. The contact tracing is set to start 7 days prior to that date (to account for a 7-day incubation period of the novel infectious disease) and will conclude 7 days afterwards (in order to account for visits that she delivered while she was symptomatic as well as to capture a longer timeframe for secondary and tertiary visits to have occurred). The calculated begin and end dates for the contact tracing are presented back to the user immediately below the parameter input area: “All visits during 2020-05-05 and 2020-05-19 will be shown.”
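The window arithmetic in this example can be reproduced in a few lines of base R. This is only a sketch of the calculation; the variable names mirror the `look_back_days` and `look_forward_days` parameters of `getContacts`, but the code is not taken from the package.

```r
reference_date    <- as.Date("2020-05-12")  # symptom onset date
look_back_days    <- 7                      # days before the reference date
look_forward_days <- 7                      # days after the reference date

window_start <- reference_date - look_back_days
window_end   <- reference_date + look_forward_days

message("All visits during ", format(window_start), " and ",
        format(window_end), " will be shown.")
```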
The right-hand panel of the application displays the primary, secondary, and tertiary contact lists (available in the three tabs under “Contact Lists”). The user can download these lists as .csv files by clicking on the “Download” button.
The “Plot” tab displays the “network diagram” of primary, secondary, and tertiary contacts. Users can hover over the patient and staff icons to see individual details (ID, name, status) or click on an icon in order to highlight the direct contacts of an individual. If the user included patient/staff statuses, the plot legend displays each distinct status type differently. The application applies the most recent status for each patient or staff within the requested visit window. For example, imagine that Patient A has a status of “NEGATIVE” for visits on 5/1, 5/2, 5/3, and 5/4, and then a status of “POSITIVE” for visits on 5/5 and 5/6. If the requested window of visits for Patient A ends on 5/4, then Patient A will be labeled as “NEGATIVE” in the plot. However, if the requested window of visits for Patient A ends on 5/5 or 5/6, then Patient A will be labeled as “POSITIVE” in the plot.
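The “most recent status within the window” rule can be illustrated with the Patient A example. The sketch below is an assumption-laden illustration, not the application’s code: `latest_status` is a hypothetical helper, and the year 2020 is assumed because the example gives only months and days.

```r
# Patient A's visits and statuses from the example (year assumed to be 2020).
patient_a <- data.frame(
  VISIT_DATE = as.Date(c("2020-05-01", "2020-05-02", "2020-05-03",
                         "2020-05-04", "2020-05-05", "2020-05-06")),
  PATIENT_STATUS = c("NEGATIVE", "NEGATIVE", "NEGATIVE",
                     "NEGATIVE", "POSITIVE", "POSITIVE"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: status attached to the latest visit inside the window.
latest_status <- function(visits, window_start, window_end) {
  in_window <- visits[visits$VISIT_DATE >= window_start &
                        visits$VISIT_DATE <= window_end, , drop = FALSE]
  in_window$PATIENT_STATUS[which.max(as.numeric(in_window$VISIT_DATE))]
}

status_to_0504 <- latest_status(patient_a, as.Date("2020-05-01"),
                                as.Date("2020-05-04"))
status_to_0505 <- latest_status(patient_a, as.Date("2020-05-01"),
                                as.Date("2020-05-05"))
```

With the window ending on 5/4 the label is “NEGATIVE”; extending it to 5/5 changes the label to “POSITIVE”, matching the behavior described above.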
The “Visit Details” tab includes all primary, secondary, and tertiary contact visit details together and can be downloaded as a .csv file. If the input dataset included patient/staff statuses, this tab shows the statuses that correspond to each patient/staff member on each given visit date. In the example shown below, STAFF_ID 1 was the primary contact to patients with PATIENT_IDs 1043 and 1047.
The top right-hand corner of the application contains a drop-down menu with options to exit the application or to reload the user interface and upload a new dataset. It is best to exit the application by clicking on “Exit” in this menu, because this correctly closes the VisitContactTrace application from the R session.
The VisitContactTrace R package includes a sample simulated Home Healthcare Visits dataset (visitshc.RData) that users can explore for experimentation and instructional purposes.
head(visitshc, 10)
More experienced R users may want to access the contact tracing R function directly. The getContacts function returns a dataframe of primary, secondary, and tertiary contacts when the following parameters are specified: the name of a visit-based patient-staff encounter data file, the ID of the index patient/staff, reference date, and number of days forward/back.
# The example below produces contact tracing lists based on staff ID.
getContacts(staff_id = "1",
            patient_id = NA,
            reference_date = "2020-03-01",
            look_forward_days = 20,
            look_back_days = 3,
            data = visitshc,
            plot = FALSE)
VisitContactTrace is an R package that requires R to be installed. To learn more, please visit the R Project for Statistical Computing. When downloading R, you will be asked to choose a CRAN mirror, also available here. It does not matter which location you select as the CRAN mirror. Choose the correct operating system (OS) and download the relevant installation file. See more OS tips below.
After selecting the CRAN mirror and the correct OS, click on “base,” and then click on “Download R.#.#.#”. Here is an abbreviated video to demonstrate the steps.
If the R installation is successful, a desktop shortcut for R should appear. Click on that shortcut to open the R application.